Advice Generation from Observed Execution: Abstract Markov Decision Process Learning

Authors

  • Patrick Riley
  • Manuela M. Veloso
Abstract

An advising agent, a coach, provides advice to other agents about how to act. In this paper we contribute an advice generation method using observations of agents acting in an environment. Given an abstract state definition and partially specified abstract actions, the algorithm extracts a Markov Chain, infers a Markov Decision Process, and then solves the MDP (given an arbitrary reward signal) to generate advice. We evaluate our work in a simulated robot soccer environment and experimental results show improved agent performance when using the advice generated from the MDP for both a sub-task and the full soccer game.
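The pipeline described in the abstract (observations, then a Markov Chain, then an MDP, then advice) can be illustrated with a minimal sketch. This is not the authors' implementation. The state names, the `action_effects` format (each abstract action listed with the transitions it may cause), the rule for splitting probability mass among actions, and the use of value iteration as the solver are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def estimate_markov_chain(traces):
    """Count observed abstract-state transitions and normalise to probabilities."""
    counts = defaultdict(Counter)
    for trace in traces:
        for s, s_next in zip(trace, trace[1:]):
            counts[s][s_next] += 1
    return {s: {t: c / sum(nxt.values()) for t, c in nxt.items()}
            for s, nxt in counts.items()}

def infer_mdp(chain, action_effects):
    """Assign each state's transitions to the actions that may cause them.

    action_effects (hypothetical format) maps an action name to the set of
    (state, next_state) pairs it can produce. Probabilities are renormalised
    per action; this renormalisation rule is an assumption of the sketch.
    """
    mdp = defaultdict(dict)
    for s, nxt in chain.items():
        for action, pairs in action_effects.items():
            mass = {t: p for t, p in nxt.items() if (s, t) in pairs}
            total = sum(mass.values())
            if total > 0:
                mdp[s][action] = {t: p / total for t, p in mass.items()}
    return dict(mdp)

def solve(mdp, reward, gamma=0.9, sweeps=200):
    """Synchronous value iteration; returns state values and a greedy policy."""
    states = set(mdp) | {t for acts in mdp.values()
                         for dist in acts.values() for t in dist}
    value = {s: 0.0 for s in states}

    def q(dist):
        # Expected value of the next state under one action's distribution.
        return sum(p * value[t] for t, p in dist.items())

    for _ in range(sweeps):
        # States with no known actions are treated as terminal (default=0.0).
        value = {s: reward.get(s, 0.0)
                    + gamma * max((q(d) for d in mdp.get(s, {}).values()),
                                  default=0.0)
                 for s in states}
    advice = {s: max(acts, key=lambda a: q(acts[a])) for s, acts in mdp.items()}
    return value, advice

# Toy observation logs over made-up abstract states.
traces = [
    ["midfield", "attack", "goal"],
    ["midfield", "attack", "lost"],
    ["midfield", "lost"],
    ["midfield", "attack", "goal"],
]
# Hypothetical partial action specification.
action_effects = {
    "pass": {("midfield", "attack")},
    "dribble": {("midfield", "attack"), ("midfield", "lost")},
    "shoot": {("attack", "goal"), ("attack", "lost")},
}
reward = {"goal": 1.0, "lost": -1.0}  # arbitrary reward signal

chain = estimate_markov_chain(traces)
mdp = infer_mdp(chain, action_effects)
value, advice = solve(mdp, reward)
print(advice)
```

In this toy setting the advice is simply the greedy action per abstract state. Because the reward dictionary is passed in separately, the same learned transition model can be re-solved for a different reward signal without re-reading the observation logs.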

Similar resources

A Learning Approach to Knowledge Acquisition

This paper concerns knowledge acquisition for supporting therapy decision making (TDM) within the formal setting of Markov decision processes (MDPs). It presents a method for learning high-level transitions, transition probabilities and action rewards from medical databases. The method is based on state comparison. We also discuss insights in terms of what/when/why/how expert advice is needed to ...

Rehearsal Based Multi-agent Reinforcement Learning of Decentralized Plans

Decentralized partially-observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. Reinforcement learning (RL) based approaches have been recently proposed for distributed solution of Dec-POMDPs ...

Concurrent Markov Decision Processes for Robust Robot Team Learning under Uncertainty

For robots to become a more common fixture in private and public industries, they must exhibit compliant individual and social learning. To achieve social compliance, while maintaining individual performance, robots must represent knowledge accurately in both certain and uncertain environments. Robots also need to quantify effective decision making both when isolated and when teamed with peer r...

Proactive scheduling in distributed computing - A reinforcement learning approach

In distributed computing such as grid computing, online users submit their tasks anytime and anywhere to dynamic resources. Task arrival and execution processes are stochastic. How to adapt to the consequent uncertainties, as well as scheduling overhead and response time, are the main concern in dynamic scheduling. Based on the decision theory, scheduling is formulated as a Markov decision proc...

Errata Preface Recent Advances in Hierarchical Reinforcement Learning

Decision Making, Guest Edited by Xi-Ren Cao. The Publisher offers an apology for printing an incorrect version of the paper in the special issue and renders this paper as the true and correct paper. Abstract. Reinforcement learning is bedeviled by the curse of dimensionality: the number of parameters to be learned grows exponentially with the size of any compact encoding of a state. Recent atte...

Publication date: 2004